Visualization of Pairwise and Multilocus Linkage Disequilibrium Structure Using Latent Forests
Abstract
The study of linkage disequilibrium is a major topic in statistical genetics: it plays a fundamental role in gene mapping and helps us learn more about human history. The complex structure of linkage disequilibrium makes exploratory data analysis both essential and challenging. Visualization methods, such as the triangular heat map implemented in Haploview, provide simple and useful tools for understanding complex genetic patterns, but they remain insufficient to describe them fully. Probabilistic graphical models are widely recognized as a powerful formalism that allows concise and accurate modeling of dependences between variables. In this paper, we propose a method for visualizing short-range, long-range and chromosome-wide linkage disequilibrium using forests of hierarchical latent class models. Thanks to its hierarchical nature, our method is shown to give the geneticist a compact view of both pairwise and multilocus spatial structures of linkage disequilibrium. In addition, a multilocus linkage disequilibrium measure has been designed to evaluate linkage disequilibrium within the clusters of the hierarchy. To learn the proposed model, we present a new scalable algorithm. It constrains the scope of dependences according to physical positions and can handle more than one hundred thousand single nucleotide polymorphisms. The algorithm is fast and does not require phased genotypic data.
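As a rough illustration of two ideas mentioned in the abstract (pairwise linkage disequilibrium computed from unphased genotypes, and a dependence scope limited by physical position), the following sketch may help. It is not the authors' latent forest algorithm or their multilocus measure. It assumes genotypes coded as minor-allele counts (0, 1, 2), uses the squared Pearson correlation of those counts as a common stand-in for pairwise r², and the function names, the `max_distance` parameter and the toy data are all hypothetical.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2.

    Assumption: this is a generic proxy for pairwise LD on unphased data,
    not the measure defined in the paper.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, reported as 0 here
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def windowed_pairwise_ld(genotypes, positions, max_distance):
    """Compute r^2 only for SNP pairs at most max_distance base pairs apart.

    genotypes: one list of 0/1/2 codes per SNP (same individuals in each).
    positions: physical positions in base pairs, sorted in ascending order.
    """
    pairs = {}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if positions[j] - positions[i] > max_distance:
                break  # positions are sorted, so later SNPs are even farther
            pairs[(i, j)] = genotype_r2(genotypes[i], genotypes[j])
    return pairs


# Toy data (hypothetical): 4 SNPs genotyped in 6 individuals.
genotypes = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # identical to SNP 0
    [2, 1, 0, 1, 2, 0],  # mirror image of SNP 0
    [0, 0, 1, 2, 1, 0],
]
positions = [100, 600, 1200, 90000]
ld = windowed_pairwise_ld(genotypes, positions, max_distance=5000)
```

Restricting pairs by distance keeps the number of comparisons close to linear in the number of SNPs when the window is small, which is the general reason a position-based constraint helps with scalability.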
Similar resources
Effect of two- and three-locus linkage disequilibrium on the power to detect marker/phenotype associations.
There has been much recent interest in describing the patterns of linkage disequilibrium (LD) along a chromosome. Most empirical studies that have examined this issue have concentrated on LD between collections of pairs of markers and have not considered the joint effect of a group of markers beyond these pairwise connections. Here, we examine many different patterns of LD defined by both pairw...
Measuring and partitioning the high-order linkage disequilibrium by multiple order Markov chains.
A map of the background levels of disequilibrium between nearby markers can be useful for association mapping studies. In order to assess the background levels of linkage disequilibrium (LD), multilocus LD measures are more advantageous than pairwise LD measures because the combined analysis of pairwise LD measures is not adequate to detect simultaneous allele associations among multiple marker...
Allozymic Variation and Linkage Disequilibrium in Some Laboratory Populations of DROSOPHILA MELANOGASTER.
Nine laboratory populations of D. melanogaster were surveyed by starch gel electrophoresis for variation at 17 enzyme loci. A single-fly extract could be assayed for all 17 enzymes, so that the data consist of 17-locus genotypes. Pairwise linkage disequilibria were estimated from the multilocus genotypic frequencies, using both Burrows' and Hill's methods. Large amounts of linkage disequilibri...
Pedigree disequilibrium tests for multilocus haplotypes.
Association tests of multilocus haplotypes are of interest both in linkage disequilibrium mapping and in candidate gene studies. For case-parent trios, I discuss the extension of existing multilocus methods to include ambiguous haplotypes in tests of models which distinguish between the cis and trans phase. A likelihood-ratio test is proposed, using the expectation-maximization (E-M) algorithm ...
Linkage Disequilibrium between Allozymes in Natural Populations of Lodgepole Pine.
Pairwise linkage disequilibrium values (D) were estimated for 14 allozyme loci in two natural populations of lodgepole pine (Pinus contorta ssp. latifolia). Maternal multilocus genotypes were inferred from samples of (haploid) megagametophytic seed-endosperms. Coupling/repulsion double heterozygotes were distinguished for closely linked pairs of loci. Assays of seven of the loci in seed embryos...